Improved MLP structures for data-driven feature extraction for ASR
Abstract
In this paper, we present our recent progress on multi-layer perceptron (MLP) based data-driven feature extraction using improved MLP structures. Four-layer MLPs are used in this study. Different signal processing methods are applied before the input layer of the MLP. We show that the first hidden layer of a four-layer MLP is able to detect some basic patterns in the time-frequency plane. KLT-based dimension reduction along time is applied as a modulation frequency filter. The new feature extraction was tested on a large vocabulary continuous speech recognition (LVCSR) task using the NIST 2001 evaluation set. We achieved an 11.6% relative word error rate (WER) reduction compared to the traditional PLP-based baseline features. This is also a significant improvement over our previously published results on the same task using MLP-based features with three-layer MLPs.
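The KLT-based dimension reduction along time mentioned in the abstract can be pictured as a principal-component projection of each band's temporal trajectory: the leading eigenvectors of the trajectory covariance act as FIR filters over time, which is why the step behaves like a modulation frequency filter. The following is a minimal NumPy sketch of that idea, not the authors' implementation. The function names, the context of 25 frames on each side, the 8 retained components, and the synthetic trajectory are all assumptions made for illustration; the abstract does not state these values.

```python
import numpy as np


def temporal_windows(band_energy, context):
    """Return sliding windows of 2*context+1 frames from a 1-D band trajectory."""
    width = 2 * context + 1
    return np.lib.stride_tricks.sliding_window_view(band_energy, width)


def fit_klt(windows, n_components):
    """Estimate the KLT (PCA) basis from windows of shape (n_windows, width)."""
    mean = windows.mean(axis=0)
    centered = windows - mean
    cov = centered.T @ centered / (windows.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]  # keep the largest ones
    return mean, eigvecs[:, order]


def apply_klt(windows, mean, basis):
    """Project each temporal window onto the retained KLT basis vectors."""
    return (windows - mean) @ basis


# Hypothetical data: a random-walk stand-in for one critical band's log energy.
rng = np.random.default_rng(0)
trajectory = np.cumsum(rng.standard_normal(500))

windows = temporal_windows(trajectory, context=25)   # assumed 51-frame context
mean, basis = fit_klt(windows, n_components=8)       # assumed 8 components
reduced = apply_klt(windows, mean, basis)
print(windows.shape, "->", reduced.shape)
```

Each column of `basis` can be read as the impulse response of one temporal filter; in a full front end the same projection would presumably be applied per frequency band before the MLP input layer.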
Related articles
Reverse Correlation for Analyzing MLP Posterior Features in ASR
In this work, we investigate the reverse correlation technique for analyzing posterior feature extraction using a multilayer perceptron trained on multi-resolution RASTA (MRASTA) features. The filter bank in MRASTA feature extraction is motivated by human auditory modeling. The MLP is trained based on an error criterion and is purely data driven. In this work, we analyze the functionality of...
Automatic data selection for MLP-based feature extraction for ASR
The use of huge databases in ASR has become an important source of ASR system improvements in recent years. However, their use demands an increase in the computational resources needed to train the recognizers. Several techniques have been proposed in the literature to make better use of these enormous databases by selecting the most 'informative' portions and thus red...
Multilingual hierarchical MRASTA features for ASR
Recently, a multilingual Multi Layer Perceptron (MLP) training method was introduced without having to explicitly map the phonetic units of multiple languages to a common set. This paper further investigates this method using bottleneck (BN) tandem connectionist acoustic modeling for four high-resourced languages — English, French, German, and Polish. Aiming at the improvement of already existi...
(Deep) Neural Networks
This work continues the development of the recently proposed Bottle-Neck features for ASR. A five-layer MLP used in bottleneck feature extraction makes it possible to obtain an arbitrary feature size without dimensionality reduction by transforms, independently of the MLP training targets. The MLP topology – number and sizes of layers, suitable training targets, the impact of output feature transforms, the ne...
Data-derived Nonlinear Mapping for Feature Extraction in HMM
A rather long temporal trajectory of critical-band logarithmic power spectrum energy at a given frequency is used as an input feature vector in an MLP-based phoneme classifier, trained on task-independent hand-labeled development data. Class-specific log-likelihood vectors from the individual sub-classifiers form the input to a merging MLP classifier trained on the training data. Output of this merging ...